[insert boilerplate from SOW and/or Tech Memo; expand a bit on salmonIPM…]
The chum life cycle begins with spawning, egg incubation and fry emergence, and shortly thereafter the downstream migration of juveniles (which we refer to as smolts). In salmonIPM we can fit three alternative spawner-recruit functions to describe the expected relationship between spawner abundance \(S_{jt}\) and smolt abundance \(M_{jt}\) in brood year \(t\) in population \(j\): density-independent discrete exponential, Beverton-Holt, and Ricker:
\[ f \left( S_{jt} | \alpha_{jt}, M_{\text{max},j}, A_{jt} \right) = \begin{cases} \alpha_{jt} S_{jt} & \text{exponential} \\ \dfrac{ \alpha_{jt} S_{jt} }{ 1 + \dfrac{ \alpha_{jt} S_{jt} }{ A_{jt} M_{\text{max},j} }} & \text{Beverton-Holt} \\ \alpha_{jt} S_{jt} \text{exp}\left(- \dfrac{ \alpha_{jt}S_{jt} }{ \text{exp}(1) A_{jt} M_{\text{max},j} } \right) & \text{Ricker} \end{cases} \]
We use a nonstandard parameterization of the Ricker in terms of maximum smolt production or “capacity” \(M_{\text{max}}\), corresponding to the mode of the function. This facilitates direct comparison with the Beverton-Holt (where \(M_{\text{max}}\) is the asymptote) and is better-identified by data than the standard parameterization based on per capita density dependence. \(M_{\text{max}}\) has units of density (fish per stream distance or area) and is then expanded to units of abundance based on habitat size \(A_{jt}\). In the case of Lower Columbia chum, this is km of spawning habitat estimated using GIS.
Intrinsic productivity \(\alpha_{jt}\) is calculated as a weighted mean of age-specific female fecundity \(\mu_{E,a}\), weighted by the spawner age distribution \(q_{jta}\), multiplied by the proportion of female spawners \(q_{\text{F},jt}\) (discounted for the proportion of females that are not “green”, i.e. not fully fecund, with discount rate \(\delta_\text{NG} \in [0,1]\)), and finally multiplied by the density-independent maximum egg-to-smolt survival \(\psi_j\):
\[ \alpha_{jt} = \psi_j q_{F,jt} \left[p_{\text{G},jt} + \delta_\text{NG} \left(1 - p_{\text{G},jt} \right) \right] \sum_{a=3}^{5} q_{jta} \mu_{E,a}. \]
The discount for partially spawned females is only relevant for one population, Duncan Channel, a constructed spawning channel in which some translocated females are believed to have already deposited eggs elsewhere. We assume the proportion of “green” females \(p_{\text{G},jt}\) is known without error since translocated spawners are individually handled and visually assessed.
This formulation implicitly assumes egg deposition is density-independent while egg-to-smolt survival is density-dependent, resulting in realized survival \(< \psi_j\) as spawner density increases. Maximum egg-to-smolt survival varies randomly among populations according to the hyperdistribution
\[ \text{logit}(\psi_j) \sim N(\mu_\psi, \sigma_\psi). \]
Maximum smolt production varies randomly among populations according to the hyperdistribution
\[ \text{log}(M_{\text{max},j}) \sim N(\mu_{M_\text{max}}, \sigma_{M_\text{max}}). \]
Given the expected smolt production, realized smolt production (the unknown “true state”) is lognormally distributed with process errors that include a common ESU-level autoregressive trend \(\eta^\text{year}_{M,t}\) with first-order autocorrelation coefficient \(\rho_{M}\) and innovation SD \(\sigma^\text{year}_{M}\), plus unique independent shocks \(\epsilon_{M,jt}\):
\[ \begin{aligned} M_{jt} &= f \left( S_{jt} | \alpha_{jt}, M_{\text{max},j}, A_{jt} \right) \, \text{exp}( \eta^\text{year}_{M,t} + \epsilon_{M,jt} ) \\ \eta^\text{year}_{M,t} &\sim N(\rho_{M} \eta^\text{year}_{M,t-1}, \sigma^\text{year}_{M}) \\ \epsilon_{M,jt} &\sim N(0, \sigma_{M}). \end{aligned} \]
In salmonIPM we can include covariates of the parameters \(\psi_j\) and \(M_{\text{max},j}\) as well as the smolt recruitment process error term. We do not consider this option further here, but we anticipate incorporating environmental covariates in future development of the Lower Columbia chum IPM.
Smolt-to-adult survival (SAR, \(s_{MS}\)) is assumed to be density-independent. The SAR process model for outmigrant cohort \(t\) in population \(j\) is logistic normal, including a common ESU-level first-order autoregressive trend \(\eta^\text{year}_{MS,t}\) with first-order autocorrelation coefficient \(\rho_{MS}\) and innovation SD \(\sigma^\text{year}_{MS}\), plus unique independent shocks \(\epsilon_{MS,jt}\):
\[ \begin{aligned} \text{logit}( s_{MS,jt} ) &= \text{logit}( \mu_{MS} ) + \eta^\text{year}_{MS,t} + \epsilon_{MS,jt} \\ \eta^\text{year}_{MS,t} &\sim N(\rho_{MS} \eta^\text{year}_{MS,t-1}, \sigma^\text{year}_{MS}) \\ \epsilon_{MS,jt} &\sim N(0, \sigma_{MS}). \end{aligned} \] As with the smolt recruitment parameters, salmonIPM can accommodate spatiotemporally varying covariates of \(s_{MS}\), but we do not discuss this further here.
Adult age structure is modeled by defining a vector of conditional probabilities, \(\mathbf{p}_{jt} = [p_{3jt}, p_{4jt}, p_{5jt}] ^ \top\), where \(p_{ajt}\) is the probability that an outmigrant in year \(t\) in population \(j\) returns at age \(a\), given that it survives to adulthood. The unconditional probability is given by \(s_{MS,jt} p_{ajt}\), where both SAR and \(p_a\) are functions of underlying annual marine survival and maturation probabilities that are nonidentifiable without some ancillary data. This parameterization resolves the nonidentifiability.
The conditional age probabilities follow a logistic normal process model with hierarchical structure across populations and through time within each population. The additive log ratio,
\[ \text{alr}(\mathbf{p_{jt}}) = \left[ \text{log} \left( \dfrac{p_{3jt}}{p_{5jt}} \right), \text{log} \left( \dfrac{p_{4jt}}{p_{5jt}} \right) \right] ^ \top \]
has a hierarchical bivariate normal distribution with population-level random effects \(\boldsymbol{\eta}^\text{pop}_{\mathbf{p}, j}\) and unique residuals \(\boldsymbol{\epsilon}_{\mathbf{p}, jt}\):
\[ \begin{aligned} \text{alr}(\mathbf{p_{jt}}) &= \text{alr}(\boldsymbol{\mu}_\mathbf{p}) + \boldsymbol{\eta}^\text{pop}_{\mathbf{p}, j} + \boldsymbol{\epsilon}_{\mathbf{p}, jt} \\ \boldsymbol{\eta}^\text{pop}_{\mathbf{p}, j} &\sim N(\mathbf{0}, \boldsymbol{\Sigma}^\text{pop}_\mathbf{p}) \\ \boldsymbol{\epsilon}_{\mathbf{p}, jt} &\sim N(\mathbf{0}, \boldsymbol{\Sigma}_\mathbf{p}). \end{aligned} \]
Here the 2 \(\times\) 2 covariance matrices \(\boldsymbol{\Sigma}^\text{pop}_\mathbf{p}\) and \(\boldsymbol{\Sigma}_\mathbf{p}\) allow correlated variation among age classes (on the unconstrained scale, not merely due to the simplex constraint on \(\mathbf{p}\)) across populations and through time within a population, respectively. For example, some populations or cohorts may skew overall younger or older than average. We parameterize each covariance matrix by a vector of standard deviations and a correlation matrix:
\[ \begin{aligned} \boldsymbol{\Sigma}^\text{pop}_\mathbf{p} &= \boldsymbol{\sigma}^\text{pop}_\mathbf{p} \mathbf{R}_\mathbf{p}^\text{pop} { \boldsymbol{\sigma}^\text{pop}_\mathbf{p} } ^ \top \\ \boldsymbol{\Sigma}_\mathbf{p} &= \boldsymbol{\sigma}_\mathbf{p} \mathbf{R}_\mathbf{p} \boldsymbol{\sigma}_\mathbf{p} ^ \top . \end{aligned} \]
Adult sex structure is modeled as the conditional probability \(p_{\text{F},jt}\) that an outmigrant from population \(j\) in year \(t\) is female, given that it survives to adulthood. The proportion of females follows a process model that includes normally distributed population-specific random effects \(\eta^\text{pop}_{\text{F}}\) with hyper-SD \(\sigma^\text{pop}_\text{F}\) and unique residuals \(\epsilon^\text{pop}_{\text{F}}\) with SD \(\sigma_\text{F}\) around the hyper-mean \(\mu_\text{F}\).
\[ \begin{aligned} \text{logit}( p_{\text{F},jt} ) &= \text{logit}( \mu_\text{F} ) + \eta^\text{pop}_{\text{F},j} + \epsilon_{\text{F},jt} \\ \eta^\text{pop}_{\text{F},j} &\sim N(0, \sigma^\text{pop}_\text{F}) \\ \epsilon_{\text{F},jt} &\sim N(0, \sigma_\text{F}) \end{aligned} \]
Recruitment of wild-origin adult spawners \(S_\text{W}\) is calculated by summing over the corresponding outmigrant cohorts and subtracting the number of spawners removed for hatchery broodstock or to spawning channels (\(B_{jt}\), assumed known without error). Fishery mortality of Lower Columbia chum is thought to be minimal.
\[ S_{\text{W}, jt} = \left(\sum_{a=3}^{5} s_{MS,j,t-a} \hspace{0.1cm} p_{aj,t-a} \hspace{0.1cm} M_{j,t-a} \right) - B_{jt} = \left(\sum_{a=3}^{5} \tilde{S}_{\text{W}, ajt} \right) - B_{jt}. \]
The proportion of naturally spawning hatchery-origin adults is modeled as an independent parameter in each population and year when hatchery programs were operational in the basin or when observed origin-frequency data indicate hatchery spawners were present. That is, we do not include a process model linking \(p_{\text{HOS},jt}\) to hatchery releases and returns, although this is planned as a component of future model development. Abundance of hatchery-origin spawners is then calculated as
\[ S_{\text{H},jt} = \dfrac{ S_{\text{W},jt} \hspace{0.1cm} p_{\text{HOS},jt} } { (1 - p_{\text{HOS},jt}) }. \]
Total spawner abundance is then
\[ S_{jt} = S_{\text{W},jt} + S_{\text{H},jt} + S^\text{add}_{jt} \] where \(S^\text{add}_{jt}\) is the number of spawners that returned to populations \(i \neq j\) and were translocated into population \(j\). This is only nonzero in the case of Duncan Channel.
Spawner age structure is given by \(\mathbf{q}_{jt} = [q_{3jt}, q_{4jt}, q_{5jt}]\), where \(q_{ajt} = \tilde{S}_{\text{W},ajt} / \sum_a{\tilde{S}_{\text{W},ajt}}\). Spawner sex structure is given by the age-weighted average of female proportions from the respective outmigrant cohorts: \(q_{\text{F},jt} = \sum_{a} {q_{ajt} \hspace{0.1cm} p_{\text{F},j,t-a}}\).
We modeled observations of fecundity from individual female chum salmon collected at hatcheries. The observation likelihood for the fecundity of female \(i\) of age \(a\) is a zero-truncated normal with age-specific mean and SD:
\[ E_{a,i}^\text{obs} \sim N_+(\mu_{E,a}, \sigma_{E,a}). \]
Empirical estimates of smolt and spawner abundance come from monitoring programs that use various sampling methods (e.g., rotary screw traps for outmigrants and weirs or mark-recapture for adults) and statistical models (e.g., the BTSPAS R package or other Bayesian time-series models) to produce a posterior distribution of the abundance of a given life-history stage in each population and year. These methods have previously been applied and their output summarized by the posterior mean and SD. We used the posterior moments of abundance to compute the log-mean \(\mu_{K,jt}\) and log-SD \(\tau_{K,jt}\) for each life stage \(K\) assuming the posterior is lognormal, which appears to be reasonable in general. We then used the observation-model posteriors as informative priors on the corresponding state variables in the IPM.
One complication is that outmigrant monitoring sites do not always have a one-to-one correspondence with independent populations. Specifically, the smolt trap in the mainstem Grays River intercepts juveniles produced in the mainstem population itself as well as the upstream West Fork Grays River and Crazy Johnson Creek populations, the latter of which is also sampled with a dedicated trap. To make the states comparable to the observations, we defined a derived state \(\tilde{M}_{jt}\) that is the sum of smolts produced in population \(j\) and all upstream populations, assuming no mortality.
\[ \begin{aligned} \text{log}(\tilde{M}_{jt}) &\sim N(\mu_{\tilde{M},jt}, \tau_{\tilde{M},jt}) \\ \text{log}(S_{jt}) &\sim N(\mu_{S,jt}, \tau_{S,jt}). \end{aligned} \]
The IPM framework naturally handles missing observations, but in some cases an estimate of the observation prior log-mean \(\mu_{K,jt}\) was available while the observation error log-SD \(\tau_{K,jt}\) was missing or unknown. These missing values were imputed within the IPM by fitting a lognormal hyperdistribution to the known log-SDs:
\[ \begin{aligned} \text{log}(\tau_{M,ij}) &\sim N( \mu_{\tau_M}, \sigma_{\tau_M}) \\ \text{log}(\tau_{S,ij}) &\sim N( \mu_{\tau_S}, \sigma_{\tau_S}). \end{aligned} \]
Spawner age-composition data are collected from scale samples in each population and year (the IPM framework naturally handles cases in which no age samples are available). Age frequencies of wild spawners \(\mathbf{n}_{ajt}^\text{obs} = [n_{3jt}^\text{obs}, n_{4jt}^\text{obs}, n_{5jt}^\text{obs}] ^\top\) are assumed to follow a multinomial observation likelihood with expected proportions given by the true state:
\[ \mathbf{n}_{ajt}^\text{obs} \sim \text{Multinomial} \left( \sum_{a=3}^5 n_{ajt}^\text{obs}, \mathbf{q}_{jt} \right). \]
Frequencies of wild- and hatchery-origin spawners are estimated based on otolith marking or genetic parentage-based tagging (PBT), and are assumed to follow a binomial observation likelihood given the total sample size assigned to rearing origin:
\[ n_{\text{H},jt}^\text{obs} \sim \text{Bin} \left( n_{\text{W},jt}^\text{obs} + n_{\text{H},jt}^\text{obs}, p_{\text{HOS},jt} \right). \]
The observed frequency of female spawners is modeled with a binomial observation likelihood given the total sample size assigned to sex:
\[ n_{\text{F},jt}^\text{obs} \sim \text{Bin} \left( n_{\text{M},jt}^\text{obs} + n_{\text{F},jt}^\text{obs}, q_{\text{F},jt} \right). \]
To complete the model specification, we need priors on all top-level or hyperparameters as well as initial states that are not generated by the process model. In the list of parameter priors below, the power-exponential (also known as generalized normal or Subbotin) density
\[ \text{PowerExp}(x | u,s,r) = \dfrac{r}{2 s \Gamma(1/r)} \text{exp} \left[ - \left( \dfrac{|x - u|}{s} \right)^r \right] \]
is used to regularize autocorrelation coefficients away from the computationally problematic boundaries 1 and -1. The Lewandowski, Kurowicka and Joe (2009) distribution with density \(\text{LKJ}(\eta)\), is a distribution over correlation matrices; when \(\eta = 1\) it is uniform. The prior mean and SD of log-mean smolt capacity, \(m_{M_\text{max}}\) and \(s_{M_\text{max}}\), are the sample 90th percentile and SD of observed log smolt density, respectively. While this data-aware prior technically uses the data twice, it is broad and weakly informative and merely serves to find a reasonable scale for overall population size, similar to widely used Bayesian inference packages.
\[ \begin{aligned} \mu_{E,a} &\sim N_+(2500,500) \\ \sigma_{E,a} &\sim N_+(500,1000) \\ \delta_{\text{NG}} &\sim U(0,1) \\ \mu_\psi &\sim U(0,1) \\ \sigma_\psi &\sim N_+(0,1) \\ \mu_{M_\text{max}} &\sim N(m_{M_\text{max}}, s_{M_\text{max}}) \\ \sigma_{M_\text{max}} &\sim N_+(0,3) \\ \rho_M &\sim \text{PowerExp}(0,0.85,20) \\ \sigma^\text{year}_M &\sim N_+(0,2) \\ \sigma_M &\sim N_+(0,2) \\ \mu_{MS} &\sim U(0,1) \\ \rho_{MS} &\sim \text{PowerExp}(0,0.85,20) \\ \sigma^\text{year}_{MS} &\sim N_+(0,2) \\ \sigma_{MS} &\sim N_+(0,2) \\ \boldsymbol{\sigma}^\text{pop}_\mathbf{p} &\sim N_+(0,2) \\ \boldsymbol{\sigma}_\mathbf{p} &\sim N_+(0,2) \\ \mathbf{R}_\mathbf{p}^\text{pop} &\sim \text{LKJ}(1) \\ \mathbf{R}_\mathbf{p} &\sim \text{LKJ}(1) \\ \mu_\text{F} &\sim U(0,1) \\ \sigma^\text{pop}_\text{F} &\sim N_+(0,2) \\ \sigma_\text{F} &\sim N_+(0,2) \\ \mu_{\tau_M} &\sim N(0,1) \\ \sigma_{\tau_M} &\sim N_+(0,5) \\ \mu_{\tau_S} &\sim N(0,1) \\ \sigma_{\tau_S} &\sim N_+(0,5) \end{aligned} \]
Finally, lognormal priors on initial smolt abundance (in year 1 of each population time series) and spawner abundance (in years 1-3) are also data-aware but weakly informative, with parameters equal to the sample log-mean and 2 times the sample log-SD of observed abundance, respectively. The prior on spawner age composition in years 1-3 is simplex-uniform over the space of “orphan” age classes, i.e. those too old to have parent brood years included in the process model. The proportion of females in the same orphan age classes is given a \(\text{Beta}(3,3)\) prior to weakly regularize toward an equal sex ratio.
User inputs needed to fit the Lower Columbia chum IPM include, first, a data frame containing observations and associated information organized by population and year. We can look at the subset of rows of fish_data_SMS for the Crazy Johnson Creek population (coded Grays CJ) to see what column names and formats salmonIPM expects:
| pop | year | A | S_obs | tau_S_obs | M_obs | tau_M_obs | downstream_trap | n_age3_obs | n_age4_obs | n_age5_obs | n_H_obs | n_W_obs | n_M_obs | n_F_obs | p_G_obs | fit_p_HOS | B_take_obs | S_add_obs | F_rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Grays CJ | 2004 | 0.6 | 1051 | 0.16 | NA | NA | 181 | 4 | 26 | 1 | 0 | 31 | 22 | 9 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2005 | 0.6 | 1418 | 0.03 | NA | NA | 182 | 56 | 101 | 40 | 12 | 197 | 85 | 124 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2006 | 0.6 | 3819 | 0.05 | NA | NA | 183 | 65 | 250 | 9 | 13 | 324 | 176 | 161 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2007 | 0.6 | 870 | 0.13 | NA | NA | 184 | 12 | 77 | 14 | 4 | 103 | 55 | 52 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2008 | 0.6 | 1093 | 0.13 | NA | NA | 185 | 45 | 78 | 4 | 13 | 127 | 57 | 83 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2009 | 0.6 | 996 | 0.03 | NA | NA | 186 | 74 | 26 | 2 | 3 | 102 | 47 | 58 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2010 | 0.6 | 865 | 0.13 | NA | NA | 187 | 11 | 65 | 0 | 2 | 76 | 39 | 39 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2011 | 0.6 | 2304 | 0.12 | NA | NA | 188 | 11 | 184 | 4 | 16 | 199 | 110 | 105 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2012 | 0.6 | 3475 | 0.09 | 576450 | 0.24 | 189 | 21 | 184 | 34 | 8 | 239 | 177 | 70 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2013 | 0.6 | 1925 | 0.06 | 1466141 | 0.04 | 190 | 112 | 106 | 25 | 19 | 244 | 148 | 115 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2014 | 0.6 | 1541 | 0.04 | 1101634 | 0.10 | 191 | 37 | 165 | 11 | 25 | 214 | 151 | 88 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2015 | 0.6 | 4193 | 0.08 | 419369 | 0.14 | 192 | 114 | 170 | 14 | 26 | 298 | 165 | 159 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2016 | 0.6 | 5987 | 0.06 | 1155179 | 0.06 | 193 | 31 | 236 | 26 | 10 | 294 | 165 | 139 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2017 | 1.0 | 3681 | 0.15 | 544785 | 0.07 | 194 | 96 | 195 | 79 | 25 | 371 | 206 | 190 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2018 | 1.0 | 899 | 0.02 | 1259033 | 0.07 | 195 | 122 | 46 | 4 | 8 | 172 | 102 | 78 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2019 | 1.0 | 72 | 1.17 | 276500 | 0.36 | 196 | 3 | 11 | 0 | 0 | 14 | 10 | 4 | 1 | 1 | 0 | 0 | 0 |
| Grays CJ | 2020 | 1.0 | NA | NA | 8189 | 0.10 | 197 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
Column names generally follow the mathematical notation; fit_p_HOS is a logical or numeric variable indicating whether \(p_{\text{HOS},jt}\) should be estimated or set to zero, and downstream_trap gives the row index corresponding to the downstream smolt trap that captures outmigrants from the given pop and year. (In this case, Grays CJ smolts are also counted at a local trap.)
Second, fecundity_data contains observations of individual fecundity indexed by hatchery female. Here are the first few rows (year and ID are not actually used in modeling):
| year | ID | age_E | E_obs |
|---|---|---|---|
| 2009 | F-1 | 3 | 2322 |
| 2009 | F-6 | 3 | 2481 |
| 2009 | F-13 | 3 | 2930 |
| 2009 | F-19 | 3 | 2817 |
| 2009 | F-20 | 3 | 2905 |
| 2009 | F-22 | 3 | 1803 |
| 2009 | F-29 | 3 | 2284 |
| 2009 | F-32 | 3 | 2586 |
| 2009 | F-35 | 3 | 1855 |
| 2009 | F-42 | 3 | 2533 |
We can now call the function salmonIPM() to fit models to the historical Lower Columbia chum monitoring data. Here we use the Ricker spawner-recruit function. This model takes approximately 3 hr to fit on a multicore 3-GHz Intel processor.
LCRchum_Ricker <- salmonIPM(fish_data = fish_data_SMS, fecundity_data = fecundity_data,
ages = list(M = 1), stan_model = "IPM_LCRchum_pp", SR_fun = "Ricker",
log_lik = TRUE, chains = 3, iter = 1500, warmup = 500,
control = list(adapt_delta = 0.99, max_treedepth = 14))
print(LCRchum_Ricker, prob = c(0.05,0.5,0.95),
pars = c("psi","Mmax","eta_year_M","eta_year_MS","eta_pop_p","mu_pop_alr_p","p","p_F",
"tau_M","tau_S","p_HOS","B_rate","E_hat","M","S","s_MS","q","q_F","LL"),
include = FALSE, use_cache = FALSE)
Inference for Stan model: IPM_LCRchum_pp.
3 chains, each with iter=1500; warmup=500; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=3000.
mean se_mean sd 5% 50% 95% n_eff Rhat
mu_E[1] 2591.57 0.64 44.93 2518.44 2591.13 2667.22 4936 1.00
mu_E[2] 2856.96 0.30 24.60 2817.14 2856.61 2897.67 6685 1.00
mu_E[3] 2868.99 1.11 73.02 2747.91 2869.97 2985.28 4357 1.00
sigma_E[1] 510.29 0.48 33.22 457.73 509.36 566.74 4833 1.00
sigma_E[2] 560.74 0.28 17.22 533.23 560.26 590.04 3829 1.00
sigma_E[3] 435.38 0.93 55.47 355.67 429.31 534.84 3584 1.00
delta_NG 0.57 0.01 0.24 0.17 0.58 0.95 1626 1.00
mu_psi 0.59 0.00 0.08 0.46 0.58 0.73 668 1.01
sigma_psi 0.40 0.01 0.26 0.05 0.37 0.87 756 1.00
mu_Mmax 7.28 0.02 0.58 6.44 7.24 8.27 675 1.00
sigma_Mmax 1.31 0.01 0.47 0.75 1.21 2.17 1016 1.00
rho_M 0.10 0.02 0.43 -0.64 0.13 0.75 519 1.00
sigma_year_M 0.46 0.00 0.12 0.30 0.45 0.68 1130 1.00
sigma_M 0.30 0.00 0.05 0.22 0.29 0.37 707 1.00
mu_MS 0.00 0.00 0.00 0.00 0.00 0.00 1515 1.00
rho_MS 0.49 0.01 0.22 0.10 0.52 0.80 717 1.00
sigma_year_MS 1.03 0.01 0.22 0.73 1.00 1.43 1017 1.00
sigma_MS 0.56 0.00 0.05 0.47 0.56 0.66 627 1.01
mu_p[1] 0.23 0.00 0.02 0.20 0.23 0.27 461 1.01
mu_p[2] 0.72 0.00 0.02 0.69 0.72 0.75 549 1.01
mu_p[3] 0.04 0.00 0.01 0.03 0.04 0.05 444 1.00
sigma_pop_p[1] 0.20 0.01 0.17 0.02 0.16 0.53 290 1.02
sigma_pop_p[2] 0.14 0.01 0.12 0.01 0.11 0.37 332 1.01
R_pop_p[1,1] 1.00 NaN 0.00 1.00 1.00 1.00 NaN NaN
R_pop_p[1,2] 0.35 0.03 0.58 -0.78 0.53 0.98 529 1.00
R_pop_p[2,1] 0.35 0.03 0.58 -0.78 0.53 0.98 529 1.00
R_pop_p[2,2] 1.00 0.00 0.00 1.00 1.00 1.00 2848 1.00
sigma_p[1] 1.70 0.01 0.14 1.48 1.69 1.94 509 1.01
sigma_p[2] 0.87 0.00 0.09 0.72 0.86 1.03 565 1.01
R_p[1,1] 1.00 NaN 0.00 1.00 1.00 1.00 NaN NaN
R_p[1,2] 0.75 0.00 0.06 0.64 0.76 0.85 695 1.01
R_p[2,1] 0.75 0.00 0.06 0.64 0.76 0.85 695 1.01
R_p[2,2] 1.00 0.00 0.00 1.00 1.00 1.00 3018 1.00
mu_F 0.50 0.00 0.02 0.47 0.50 0.52 1000 1.00
sigma_pop_F 0.19 0.00 0.07 0.09 0.18 0.31 770 1.00
sigma_F 0.38 0.00 0.04 0.32 0.37 0.44 1112 1.00
mu_tau_M 0.08 0.00 0.01 0.06 0.08 0.10 3418 1.00
sigma_tau_M 1.13 0.00 0.12 0.96 1.13 1.34 3193 1.00
mu_tau_S 0.11 0.00 0.01 0.10 0.11 0.12 2642 1.00
sigma_tau_S 0.98 0.00 0.06 0.89 0.98 1.08 2966 1.00
lp__ -41678.71 1.36 38.84 -41745.68 -41677.70 -41614.36 812 1.01
Samples were drawn using NUTS(diag_e) at Sun May 09 06:10:01 2021.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
The results (showing only hyperparameters for readability) suggest that the Markov chains mixed adequately, since the Gelman-Rubin \(\widehat{R}\) diagnostics are all ≤ 1.01 and effective sample sizes \(n_\text{eff}\) are sufficient for computing 90% credible intervals. We may get false-positive warnings that some \(\widehat{R}\) and \(n_\text{eff}\) estimates are NaN, but these are for fixed diagonal elements of correlation matrices. We could further explore the chains and HMC diagnostics using visualizations in the shinystan package.
In principle we could fit all three spawner-recruit functions and compare them based on estimated out-of-sample predictive performance using the approximate leave-one-out cross-validation score (LOO; Vehtari et al. 2017). In practice we find that LOO estimates for IPMs fitted to the Lower Columbia chum data are too unstable to be usable (excessive Pareto k-values indicating that some observations are highly influential on the posterior), requiring brute-force cross-validation or customized methods. These are under development, but in the meantime we proceed to use the Ricker model for inference since it is biologically plausible and fits the data reasonably well.
The plot below summarizes posterior distributions of the main hyperparameters and states across the life cycle. In panels C-H, thick lines and shading correspond to the hyper-mean or ESU level and thin lines correspond to each of the 12 populations in the data set. In panels E-H, lines indicate posterior medians and shading indicates 90% posterior credible intervals at the ESU level (uncertainty around population-level curves is omitted for clarity).
There is clear evidence of density-dependence, with more among-population heterogeneity in smolt capacity than in maximum egg-to-smolt survival, but most populations exhibit little if any overcompensation over the range of observed spawner densities (shown by the x-axis rug in E-F). Both freshwater productivity (total smolt process error, G) and SAR are strongly synchronous at the ESU scale. Freshwater productivity is relatively stable except for the sharp drop in 2019, the most recent year of available data. Smolt-to-adult survival, by contrast, shows strong and quasiperiodic fluctuations consistent with large-scale oceanographic or other environmental drivers. Because of this large and as-yet-unexplained process stochasticity, the uncertainty around the final 4 years of SAR estimates (corresponding to cohorts that have not yet fully returned as adults, and thus informed more by the process model than by data) is very wide.
We now examine model fits to data components in more detail.
Fecundity was higher for 4- and 5-year old females than for the youngest age class. Here the histograms show observations and the lines and shading show the posterior median and 95% credible interval, respectively, of the estimated zero-truncated normal density, which fits the data reasonably well.
The next plot shows observed (observation-model posterior medians; points) and estimated spawner abundance for each population. Filled points indicate observations with known error SD (90% intervals are sometimes too narrow to be visible), while the SD for open points is imputed. The solid gray line is the posterior median of the true state from the IPM. Posterior 90% credible intervals indicate process (dark shading) and observation (light shading) uncertainty, i.e. the posterior predictive distribution (PPD) of hypothetical replicate data under the model. Leading or trailing years without spawner abundance estimates are included if smolt abundance is available.
Fits to the smolt abundance data, following the same conventions as for spawner abundance, immediately show how much sparser the juvenile sampling is. Only six populations are monitored with smolt traps (including Grays WF, which along with Grays CJ is incorporated into the summed smolt observations and estimates for the downstream Grays MS trap) and several of those time series are incomplete. For that reason, a greater share of the information about the underlying state comes from the process model, which induces synchronous fluctuations.
To understand how the IPM imputes the abundance observation error SD in cases where it is not reported, we can plot the lognormal hyperdistribution (posterior median and 90% credible interval) fitted to the known SD values (histogram).
Next we compare the estimated true spawner age-frequencies (posterior median and 90% credible interval) to the observed sample proportions (with 90% binomial confidence interval). Age composition varies considerably across populations and through time, tracking fluctuations in cohort strength. The precision of the estimated states follows the sample size, with smaller samples giving relatively more weight to the hierarchical process model.
Plots of the estimated adult sex ratio, following a similar format, likewise show strong interannual fluctuations and some systematic deviations from a 1:1 sex ratio across populations.
Proportion of hatchery-origin spawners was generally low, with some very small samples that cannot definitively rule out higher \(p_\text{HOS}\). In terminal years when spawner data are not yet available but hatcheries are known to be operating in the population, the estimates revert back to the independent \(U(0,1)\) prior. This is not a problem, but will change if we are able to expand the process model to include hatchery releases and returns, which would generate predictions of \(p_{\text{HOS},jt}\).
[need to update this or leave it out for now]
[summary from GitHub Issues, many / most of which are shown in plots above]
[proposed future model development from 2021 SOW]